Add Claude skill to create instrumentations by PerfectSlayer · Pull Request #10774 · DataDog/dd-trace-java

PerfectSlayer · 2026-03-09T16:56:04Z

What Does This Do

This PR introduces a Claude skill to create APM integrations.

Motivation

This is part of an experiment to integrate the APM Instrumentation Toolkit with dd-trace-java.

Additional Notes

I tried to build upgrade and feedback steps directly into the skill. I expect it to improve itself over time with usage.

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels in addition to any other useful labels
Avoid using close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Update the CODEOWNERS file on source file addition, migration, or deletion
Update public documentation with any new configuration flags or behaviors

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

pr-commenter · 2026-03-09T17:37:05Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773366819	1773389033
git_commit_sha	`fd65c0a`	`3cedda9`
release_version	1.61.0-SNAPSHOT~fd65c0aa59	1.61.0-SNAPSHOT~3cedda9dc0

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773390911	1773390911
ci_job_id	1503261681	1503261681
ci_pipeline_id	102316595	102316595
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-2ezmu32l 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-2ezmu32l 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 65 metrics, 6 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.069 s) : 0, 1068682
Total [baseline] (8.858 s) : 0, 8857608
Agent [candidate] (1.073 s) : 0, 1073000
Total [candidate] (8.829 s) : 0, 8828696
section iast
Agent [baseline] (1.225 s) : 0, 1224600
Total [baseline] (9.554 s) : 0, 9554420
Agent [candidate] (1.238 s) : 0, 1237677
Total [candidate] (9.608 s) : 0, 9607984

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.069 s	-
Agent	iast	1.225 s	155.918 ms (14.6%)
Total	tracing	8.858 s	-
Total	iast	9.554 s	696.811 ms (7.9%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.073 s	-
Agent	iast	1.238 s	164.677 ms (15.3%)
Total	tracing	8.829 s	-
Total	iast	9.608 s	779.289 ms (8.8%)

gantt
    title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.21 ms) : 0, 1210
crashtracking [candidate] (1.207 ms) : 0, 1207
BytebuddyAgent [baseline] (635.696 ms) : 0, 635696
BytebuddyAgent [candidate] (636.1 ms) : 0, 636100
AgentMeter [baseline] (29.37 ms) : 0, 29370
AgentMeter [candidate] (29.401 ms) : 0, 29401
GlobalTracer [baseline] (258.772 ms) : 0, 258772
GlobalTracer [candidate] (259.708 ms) : 0, 259708
AppSec [baseline] (32.031 ms) : 0, 32031
AppSec [candidate] (32.029 ms) : 0, 32029
Debugger [baseline] (59.507 ms) : 0, 59507
Debugger [candidate] (59.403 ms) : 0, 59403
Remote Config [baseline] (624.973 µs) : 0, 625
Remote Config [candidate] (621.433 µs) : 0, 621
Telemetry [baseline] (8.861 ms) : 0, 8861
Telemetry [candidate] (8.725 ms) : 0, 8725
Flare Poller [baseline] (6.414 ms) : 0, 6414
Flare Poller [candidate] (9.524 ms) : 0, 9524
section iast
crashtracking [baseline] (1.19 ms) : 0, 1190
crashtracking [candidate] (1.201 ms) : 0, 1201
BytebuddyAgent [baseline] (793.497 ms) : 0, 793497
BytebuddyAgent [candidate] (803.739 ms) : 0, 803739
AgentMeter [baseline] (11.311 ms) : 0, 11311
AgentMeter [candidate] (11.651 ms) : 0, 11651
GlobalTracer [baseline] (247.36 ms) : 0, 247360
GlobalTracer [candidate] (249.343 ms) : 0, 249343
IAST [baseline] (25.375 ms) : 0, 25375
IAST [candidate] (25.449 ms) : 0, 25449
AppSec [baseline] (26.623 ms) : 0, 26623
AppSec [candidate] (26.735 ms) : 0, 26735
Debugger [baseline] (63.119 ms) : 0, 63119
Debugger [candidate] (62.945 ms) : 0, 62945
Remote Config [baseline] (515.895 µs) : 0, 516
Remote Config [candidate] (520.781 µs) : 0, 521
Telemetry [baseline] (14.773 ms) : 0, 14773
Telemetry [candidate] (14.993 ms) : 0, 14993
Flare Poller [baseline] (4.849 ms) : 0, 4849
Flare Poller [candidate] (4.887 ms) : 0, 4887

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1062725
Total [baseline] (11.137 s) : 0, 11137380
Agent [candidate] (1.058 s) : 0, 1057912
Total [candidate] (11.0 s) : 0, 11000033
section appsec
Agent [baseline] (1.243 s) : 0, 1243315
Total [baseline] (11.169 s) : 0, 11168870
Agent [candidate] (1.244 s) : 0, 1243915
Total [candidate] (11.111 s) : 0, 11111389
section iast
Agent [baseline] (1.229 s) : 0, 1228545
Total [baseline] (11.358 s) : 0, 11357529
Agent [candidate] (1.224 s) : 0, 1223693
Total [candidate] (11.326 s) : 0, 11326220
section profiling
Agent [baseline] (1.179 s) : 0, 1178514
Total [baseline] (10.962 s) : 0, 10961913
Agent [candidate] (1.181 s) : 0, 1180802
Total [candidate] (11.113 s) : 0, 11112522

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.063 s	-
Agent	appsec	1.243 s	180.59 ms (17.0%)
Agent	iast	1.229 s	165.82 ms (15.6%)
Agent	profiling	1.179 s	115.789 ms (10.9%)
Total	tracing	11.137 s	-
Total	appsec	11.169 s	31.49 ms (0.3%)
Total	iast	11.358 s	220.149 ms (2.0%)
Total	profiling	10.962 s	-175.467 ms (-1.6%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.058 s	-
Agent	appsec	1.244 s	186.003 ms (17.6%)
Agent	iast	1.224 s	165.782 ms (15.7%)
Agent	profiling	1.181 s	122.891 ms (11.6%)
Total	tracing	11.0 s	-
Total	appsec	11.111 s	111.356 ms (1.0%)
Total	iast	11.326 s	326.187 ms (3.0%)
Total	profiling	11.113 s	112.489 ms (1.0%)

gantt
    title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.188 ms) : 0, 1188
crashtracking [candidate] (1.219 ms) : 0, 1219
BytebuddyAgent [baseline] (628.989 ms) : 0, 628989
BytebuddyAgent [candidate] (627.661 ms) : 0, 627661
AgentMeter [baseline] (29.583 ms) : 0, 29583
AgentMeter [candidate] (29.09 ms) : 0, 29090
GlobalTracer [baseline] (258.589 ms) : 0, 258589
GlobalTracer [candidate] (256.747 ms) : 0, 256747
AppSec [baseline] (31.82 ms) : 0, 31820
AppSec [candidate] (31.334 ms) : 0, 31334
Debugger [baseline] (59.883 ms) : 0, 59883
Debugger [candidate] (59.47 ms) : 0, 59470
Remote Config [baseline] (616.142 µs) : 0, 616
Remote Config [candidate] (621.171 µs) : 0, 621
Telemetry [baseline] (8.768 ms) : 0, 8768
Telemetry [candidate] (8.692 ms) : 0, 8692
Flare Poller [baseline] (7.308 ms) : 0, 7308
Flare Poller [candidate] (7.117 ms) : 0, 7117
section appsec
crashtracking [baseline] (1.205 ms) : 0, 1205
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (656.683 ms) : 0, 656683
BytebuddyAgent [candidate] (656.33 ms) : 0, 656330
AgentMeter [baseline] (12.027 ms) : 0, 12027
AgentMeter [candidate] (12.054 ms) : 0, 12054
GlobalTracer [baseline] (257.691 ms) : 0, 257691
GlobalTracer [candidate] (258.008 ms) : 0, 258008
IAST [baseline] (23.969 ms) : 0, 23969
IAST [candidate] (23.999 ms) : 0, 23999
AppSec [baseline] (176.813 ms) : 0, 176813
AppSec [candidate] (177.393 ms) : 0, 177393
Debugger [baseline] (65.712 ms) : 0, 65712
Debugger [candidate] (65.724 ms) : 0, 65724
Remote Config [baseline] (563.938 µs) : 0, 564
Remote Config [candidate] (569.765 µs) : 0, 570
Telemetry [baseline] (8.932 ms) : 0, 8932
Telemetry [candidate] (8.927 ms) : 0, 8927
Flare Poller [baseline] (3.628 ms) : 0, 3628
Flare Poller [candidate] (3.62 ms) : 0, 3620
section iast
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.19 ms) : 0, 1190
BytebuddyAgent [baseline] (797.897 ms) : 0, 797897
BytebuddyAgent [candidate] (793.385 ms) : 0, 793385
AgentMeter [baseline] (11.314 ms) : 0, 11314
AgentMeter [candidate] (11.268 ms) : 0, 11268
GlobalTracer [baseline] (247.073 ms) : 0, 247074
GlobalTracer [candidate] (246.706 ms) : 0, 246706
IAST [baseline] (25.086 ms) : 0, 25086
IAST [candidate] (25.085 ms) : 0, 25085
AppSec [baseline] (26.431 ms) : 0, 26431
AppSec [candidate] (26.405 ms) : 0, 26405
Debugger [baseline] (63.249 ms) : 0, 63249
Debugger [candidate] (64.903 ms) : 0, 64903
Remote Config [baseline] (521.056 µs) : 0, 521
Remote Config [candidate] (514.801 µs) : 0, 515
Telemetry [baseline] (14.863 ms) : 0, 14863
Telemetry [candidate] (13.8 ms) : 0, 13800
Flare Poller [baseline] (4.865 ms) : 0, 4865
Flare Poller [candidate] (4.564 ms) : 0, 4564
section profiling
crashtracking [baseline] (1.168 ms) : 0, 1168
crashtracking [candidate] (1.164 ms) : 0, 1164
BytebuddyAgent [baseline] (680.276 ms) : 0, 680276
BytebuddyAgent [candidate] (680.985 ms) : 0, 680985
AgentMeter [baseline] (8.635 ms) : 0, 8635
AgentMeter [candidate] (8.628 ms) : 0, 8628
GlobalTracer [baseline] (215.035 ms) : 0, 215035
GlobalTracer [candidate] (215.482 ms) : 0, 215482
AppSec [baseline] (31.82 ms) : 0, 31820
AppSec [candidate] (31.943 ms) : 0, 31943
Debugger [baseline] (62.814 ms) : 0, 62814
Debugger [candidate] (65.238 ms) : 0, 65238
Remote Config [baseline] (586.869 µs) : 0, 587
Remote Config [candidate] (588.955 µs) : 0, 589
Telemetry [baseline] (9.777 ms) : 0, 9777
Telemetry [candidate] (8.236 ms) : 0, 8236
Flare Poller [baseline] (4.235 ms) : 0, 4235
Flare Poller [candidate] (3.495 ms) : 0, 3495
ProfilingAgent [baseline] (93.557 ms) : 0, 93557
ProfilingAgent [candidate] (94.249 ms) : 0, 94249
Profiling [baseline] (94.128 ms) : 0, 94128
Profiling [candidate] (94.818 ms) : 0, 94818

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773366819	1773389033
git_commit_sha	`fd65c0a`	`3cedda9`
release_version	1.61.0-SNAPSHOT~fd65c0aa59	1.61.0-SNAPSHOT~3cedda9dc0

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773391398	1773391398
ci_job_id	1503261684	1503261684
ci_pipeline_id	102316595	102316595
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-twr0o6mn 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-twr0o6mn 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 2 performance regressions! Performance is the same for 16 metrics, 16 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	better [-245.931µs; -128.690µs] or [-9.424%; -4.931%]	same [-427.316µs; +23.293µs] or [-5.719%; +0.312%]	unstable [-91.069op/s; +220.444op/s] or [-6.574%; +15.913%]	2.422ms	7.270ms	1450.031op/s	2.610ms	7.472ms	1385.344op/s
scenario:load:petclinic:profiling:high_load	worse [+0.792ms; +1.914ms] or [+4.324%; +10.447%]	worse [+0.990ms; +2.457ms] or [+3.335%; +8.276%]	unstable [-39.620op/s; +8.308op/s] or [-15.820%; +3.317%]	19.675ms	31.413ms	234.781op/s	18.322ms	29.689ms	250.438op/s
scenario:load:petclinic:no_agent:high_load	better [-1.957ms; -0.401ms] or [-10.370%; -2.125%]	unstable [-3.423ms; -0.146ms] or [-10.914%; -0.467%]	unstable [-11.184op/s; +41.684op/s] or [-4.622%; +17.227%]	17.695ms	29.576ms	257.219op/s	18.874ms	31.360ms	241.969op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.291 ms) : 19094, 19488
.   : milestone, 19291,
appsec (19.493 ms) : 19289, 19698
.   : milestone, 19493,
code_origins (17.702 ms) : 17528, 17877
.   : milestone, 17702,
iast (17.85 ms) : 17670, 18030
.   : milestone, 17850,
profiling (18.639 ms) : 18451, 18827
.   : milestone, 18639,
tracing (17.787 ms) : 17610, 17964
.   : milestone, 17787,
section candidate
no_agent (18.143 ms) : 17957, 18329
.   : milestone, 18143,
appsec (19.635 ms) : 19433, 19838
.   : milestone, 19635,
code_origins (17.797 ms) : 17618, 17976
.   : milestone, 17797,
iast (18.261 ms) : 18080, 18443
.   : milestone, 18261,
profiling (19.885 ms) : 19682, 20088
.   : milestone, 19885,
tracing (17.802 ms) : 17623, 17982
.   : milestone, 17802,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.291 ms [19.094 ms, 19.488 ms]	-
appsec	19.493 ms [19.289 ms, 19.698 ms]	202.755 µs (1.1%)
code_origins	17.702 ms [17.528 ms, 17.877 ms]	-1.589 ms (-8.2%)
iast	17.85 ms [17.67 ms, 18.03 ms]	-1.441 ms (-7.5%)
profiling	18.639 ms [18.451 ms, 18.827 ms]	-651.856 µs (-3.4%)
tracing	17.787 ms [17.61 ms, 17.964 ms]	-1.504 ms (-7.8%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.143 ms [17.957 ms, 18.329 ms]	-
appsec	19.635 ms [19.433 ms, 19.838 ms]	1.493 ms (8.2%)
code_origins	17.797 ms [17.618 ms, 17.976 ms]	-346.063 µs (-1.9%)
iast	18.261 ms [18.08 ms, 18.443 ms]	118.615 µs (0.7%)
profiling	19.885 ms [19.682 ms, 20.088 ms]	1.743 ms (9.6%)
tracing	17.802 ms [17.623 ms, 17.982 ms]	-340.51 µs (-1.9%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.194 ms) : 1183, 1206
.   : milestone, 1194,
iast (3.305 ms) : 3259, 3351
.   : milestone, 3305,
iast_FULL (5.7 ms) : 5644, 5757
.   : milestone, 5700,
iast_GLOBAL (3.388 ms) : 3347, 3430
.   : milestone, 3388,
profiling (2.115 ms) : 2095, 2136
.   : milestone, 2115,
tracing (1.758 ms) : 1743, 1773
.   : milestone, 1758,
section candidate
no_agent (1.169 ms) : 1158, 1180
.   : milestone, 1169,
iast (3.155 ms) : 3111, 3199
.   : milestone, 3155,
iast_FULL (5.794 ms) : 5737, 5852
.   : milestone, 5794,
iast_GLOBAL (3.544 ms) : 3483, 3605
.   : milestone, 3544,
profiling (2.176 ms) : 2156, 2196
.   : milestone, 2176,
tracing (1.77 ms) : 1756, 1784
.   : milestone, 1770,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.194 ms [1.183 ms, 1.206 ms]	-
iast	3.305 ms [3.259 ms, 3.351 ms]	2.111 ms (176.8%)
iast_FULL	5.7 ms [5.644 ms, 5.757 ms]	4.506 ms (377.3%)
iast_GLOBAL	3.388 ms [3.347 ms, 3.43 ms]	2.194 ms (183.7%)
profiling	2.115 ms [2.095 ms, 2.136 ms]	921.004 µs (77.1%)
tracing	1.758 ms [1.743 ms, 1.773 ms]	563.315 µs (47.2%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.169 ms [1.158 ms, 1.18 ms]	-
iast	3.155 ms [3.111 ms, 3.199 ms]	1.986 ms (169.8%)
iast_FULL	5.794 ms [5.737 ms, 5.852 ms]	4.625 ms (395.5%)
iast_GLOBAL	3.544 ms [3.483 ms, 3.605 ms]	2.375 ms (203.1%)
profiling	2.176 ms [2.156 ms, 2.196 ms]	1.007 ms (86.1%)
tracing	1.77 ms [1.756 ms, 1.784 ms]	600.506 µs (51.4%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773366819	1773389033
git_commit_sha	`fd65c0a`	`3cedda9`
release_version	1.61.0-SNAPSHOT~fd65c0aa59	1.61.0-SNAPSHOT~3cedda9dc0

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1773391076	1773391076
ci_job_id	1503261687	1503261687
ci_pipeline_id	102316595	102316595
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-b32iywje 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-b32iywje 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.48 ms) : 1468, 1492
.   : milestone, 1480,
appsec (3.747 ms) : 3531, 3964
.   : milestone, 3747,
iast (2.261 ms) : 2192, 2331
.   : milestone, 2261,
iast_GLOBAL (2.308 ms) : 2238, 2378
.   : milestone, 2308,
profiling (2.108 ms) : 2051, 2164
.   : milestone, 2108,
tracing (2.059 ms) : 2006, 2113
.   : milestone, 2059,
section candidate
no_agent (1.473 ms) : 1462, 1485
.   : milestone, 1473,
appsec (3.823 ms) : 3600, 4045
.   : milestone, 3823,
iast (2.264 ms) : 2195, 2334
.   : milestone, 2264,
iast_GLOBAL (2.296 ms) : 2226, 2365
.   : milestone, 2296,
profiling (2.094 ms) : 2038, 2150
.   : milestone, 2094,
tracing (2.058 ms) : 2005, 2111
.   : milestone, 2058,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.48 ms [1.468 ms, 1.492 ms]	-
appsec	3.747 ms [3.531 ms, 3.964 ms]	2.267 ms (153.2%)
iast	2.261 ms [2.192 ms, 2.331 ms]	781.341 µs (52.8%)
iast_GLOBAL	2.308 ms [2.238 ms, 2.378 ms]	827.78 µs (55.9%)
profiling	2.108 ms [2.051 ms, 2.164 ms]	627.549 µs (42.4%)
tracing	2.059 ms [2.006 ms, 2.113 ms]	579.384 µs (39.1%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.473 ms [1.462 ms, 1.485 ms]	-
appsec	3.823 ms [3.6 ms, 4.045 ms]	2.349 ms (159.5%)
iast	2.264 ms [2.195 ms, 2.334 ms]	791.251 µs (53.7%)
iast_GLOBAL	2.296 ms [2.226 ms, 2.365 ms]	822.557 µs (55.8%)
profiling	2.094 ms [2.038 ms, 2.15 ms]	620.603 µs (42.1%)
tracing	2.058 ms [2.005 ms, 2.111 ms]	584.925 µs (39.7%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~3cedda9dc0, baseline=1.61.0-SNAPSHOT~fd65c0aa59
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.521 s) : 15521000, 15521000
.   : milestone, 15521000,
appsec (14.776 s) : 14776000, 14776000
.   : milestone, 14776000,
iast (18.254 s) : 18254000, 18254000
.   : milestone, 18254000,
iast_GLOBAL (17.876 s) : 17876000, 17876000
.   : milestone, 17876000,
profiling (15.008 s) : 15008000, 15008000
.   : milestone, 15008000,
tracing (14.917 s) : 14917000, 14917000
.   : milestone, 14917000,
section candidate
no_agent (15.462 s) : 15462000, 15462000
.   : milestone, 15462000,
appsec (15.09 s) : 15090000, 15090000
.   : milestone, 15090000,
iast (18.213 s) : 18213000, 18213000
.   : milestone, 18213000,
iast_GLOBAL (17.154 s) : 17154000, 17154000
.   : milestone, 17154000,
profiling (15.234 s) : 15234000, 15234000
.   : milestone, 15234000,
tracing (14.926 s) : 14926000, 14926000
.   : milestone, 14926000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.521 s [15.521 s, 15.521 s]	-
appsec	14.776 s [14.776 s, 14.776 s]	-745.0 ms (-4.8%)
iast	18.254 s [18.254 s, 18.254 s]	2.733 s (17.6%)
iast_GLOBAL	17.876 s [17.876 s, 17.876 s]	2.355 s (15.2%)
profiling	15.008 s [15.008 s, 15.008 s]	-513.0 ms (-3.3%)
tracing	14.917 s [14.917 s, 14.917 s]	-604.0 ms (-3.9%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.462 s [15.462 s, 15.462 s]	-
appsec	15.09 s [15.09 s, 15.09 s]	-372.0 ms (-2.4%)
iast	18.213 s [18.213 s, 18.213 s]	2.751 s (17.8%)
iast_GLOBAL	17.154 s [17.154 s, 17.154 s]	1.692 s (10.9%)
profiling	15.234 s [15.234 s, 15.234 s]	-228.0 ms (-1.5%)
tracing	14.926 s [14.926 s, 14.926 s]	-536.0 ms (-3.5%)

wconti27 · 2026-03-10T12:15:16Z

.claude/skills/add-apm-integrations/SKILL.md

+## Step 2 – Clarify the task
+
+If the user has not already provided all of the following, ask before proceeding:
+
+- **Framework name** and **minimum supported version** (e.g. `okhttp-3.0`)
+- **Target class(es) and method(s)** to instrument (fully qualified class names preferred)
+- **Target system**: one of `Tracing`, `Profiling`, `AppSec`, `Iast`, `CiVisibility`, `Usm`, `ContextTracking`
+- **Whether this is a bootstrap instrumentation** (affects allowed imports)


I'm curious (genuine question): do you know whether asking the user a question works in the skill's current state, given that AskUserQuestion is not in allowed-tools?

I think that if it is not in the allowed tools, it falls back to the security rules and the user's allowed tools, and otherwise asks for permission to use it. It’s not "allowed by default", but it might be useful to add it nonetheless 🤔 Similarly, it will need web search, but I don’t want to enable it by default for security reasons.

wconti27 · 2026-03-10T12:17:33Z

.claude/skills/apm-integrations/SKILL.md

+1. `docs/how_instrumentations_work.md` — full reference (types, methods, advice, helpers, context stores, decorators)
+2. `docs/add_new_instrumentation.md` — step-by-step walkthrough
+3. `docs/how_to_test.md` — test types and how to run them


Generally, for reference files, I advise using proper markdown links. It doesn't help the LLM, but it does help engineers quickly navigate to the files. Just a suggestion 😄

> I advise to use proper markdown linking

So you would use something like that?

1. [docs/how_instrumentations_work.md](full reference (types, methods, advice, helpers, context stores, decorators))
2. [docs/add_new_instrumentation.md](step-by-step walkthrough)
3. [docs/how_to_test.md](test types and how to run them)

wconti27 · 2026-03-10T12:23:57Z

.claude/skills/apm-integrations/SKILL.md

+
+Before writing any code, read all three files in full:
+
+1. `docs/how_instrumentations_work.md` — full reference (types, methods, advice, helpers, context stores, decorators)


I wonder how it will perform given that this reference is almost 1k lines. I'm not sure, to be honest.

Good question... I'm not sure either. It ingests many documentation files and instrumentations before starting the implementation, but it appears to do so using subagents, so we might be in the clear regarding context management.

Here is a report about creating (again) the Feign instrumentation:

Direct reads (Read tool)

File	Lines
google-http-client-1.19/GoogleHttpClientInstrumentation.java	121
google-http-client-1.19/GoogleHttpClientDecorator.java	68
google-http-client-1.19/HeadersInjectAdapter.java	16
google-http-client-1.19/build.gradle	21
google-http-client-1.19/AbstractGoogleHttpClientTest.groovy	53
google-http-client-1.19/GoogleHttpClientTest.groovy	21
pekko-http-1.0/HttpHeaderSubclassesInstrumentation.java	60 (partial)
javax-websocket-1.0/SessionInstrumentation.java	60 (partial)
apache-httpclient-4.0/HelperMethods.java	76
apache-httpclient-4.0/ApacheHttpClientInstrumentation.java	277
settings.gradle.kts	10 (partial)
Total	783 lines across 11 files

Via grep/bash (content snippets)

- HttpClientDecorator.java — abstract method signatures (~15 lines)
- HttpClientTest.groovy — abstract method signatures (~10 lines)
- Various directory listings and module path lookups

Via subagents (delegated research)

- Feign API research agent — 13 tool calls, web searches on Feign's API, class hierarchy, Maven coordinates, and version history
- HTTP client patterns agent — 34 tool calls, read OkHttp, Apache HttpClient, and Google HTTP Client instrumentation files in full (~1,500 estimated lines across ~12 files)

Summary

Source	Files	~Lines
Direct reads	11	783
Subagent reads	~12	~1,500
Web/docs research	—	—
Total	~23 files	~2,300 lines

The subagents did the bulk of the pattern research, freeing the main context for writing the actual implementation.


mcculls · 2026-03-10T12:43:34Z

.claude/skills/add-apm-integrations/SKILL.md

+  - `@Advice.Return` — the return value (exit only)
+  - `@Advice.Thrown` — the thrown exception (exit only)
+  - `@Advice.Enter` — the return value of the enter method (exit only)
+- Use `CallDepthThreadLocalMap` to guard against recursive instrumentation of the same method


Add: "- Do not use lambdas in advice methods"

EDIT: this should go in the "Must NOT do" section below...
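
To illustrate the recursion guard mentioned in the hunk above, here is a minimal sketch built on a plain `ThreadLocal` counter. `CallDepthGuard` and its methods are hypothetical stand-ins written for this example. They are not the real `CallDepthThreadLocalMap` class from dd-trace-java, whose API is not shown in this thread. The idea is only that the outermost call on a thread does the tracing work and nested calls skip it.

```java
public final class CallDepthGuard {
  // Per-thread nesting depth. Hypothetical stand-in for the tracer's call-depth map.
  // An anonymous subclass is used instead of a lambda to keep the sketch lambda-free.
  private static final ThreadLocal<int[]> DEPTH =
      new ThreadLocal<int[]>() {
        @Override
        protected int[] initialValue() {
          return new int[1];
        }
      };

  private CallDepthGuard() {}

  /** Returns the depth before incrementing it; 0 means this is the outermost call. */
  public static int enter() {
    return DEPTH.get()[0]++;
  }

  /** Resets the depth to 0; meant to be called from the outermost exit only. */
  public static void reset() {
    DEPTH.get()[0] = 0;
  }

  /** Example of a guarded method that calls itself recursively. */
  public static int traced(int n, StringBuilder log) {
    boolean outermost = enter() == 0;
    if (outermost) {
      log.append("start;"); // where enter advice would start a span
    }
    try {
      return n <= 0 ? 0 : 1 + traced(n - 1, log);
    } finally {
      if (outermost) {
        log.append("finish;"); // where exit advice would finish the span
        reset();
      }
    }
  }

  public static void main(String[] args) {
    StringBuilder log = new StringBuilder();
    int result = traced(3, log);
    System.out.println(result + " " + log); // prints "3 start;finish;"
  }
}
```

Although `traced` runs four times, the log records a single start/finish pair, which is the behavior the guard is there to provide.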

mcculls · 2026-03-10T12:46:39Z

.claude/skills/add-apm-integrations/SKILL.md

+Enter method:
+1. `AgentSpan span = startSpan(DECORATE.operationName(), ...)`
+2. `DECORATE.afterStart(span)` + set domain-specific tags
+3. `AgentScope scope = activateSpan(span)` — return or store via `@Advice.Local`


Should we push it towards the Context API as that will be preferred going forwards?

ContextScope scope = span.attach()

I think we should revisit our docs (/docs) first, and then reflect the upgrade in the skill. WDYT?
Upgrading the codebase would also help: the skill relies heavily on reading other instrumentations as examples because it has no reference document or reference codebase.

I added the list of files it reads to gain the knowledge needed to build (again) the Feign instrumentation here: #10774 (comment)
You can see it’s relying on some other instrumentations to know how to proceed. So cleaning up our codebase or providing references to the skill would help more, I guess.

jordan-wong · 2026-03-10T13:07:42Z

.claude/skills/add-apm-integrations/SKILL.md

+
+## Step 12 – Retrospective: update this skill with what was learned
+
+After the instrumentation is complete (or abandoned), review the full session and improve this skill for future use.


I haven't seen this type of instruction before and I'm curious how it'll perform.

My one concern with this is that we are instructing it to update the instrumentation with lessons learned before any human review is in the loop, could be too early?

I like the idea though and would like to see it in action, especially as we are in prototyping stages.

> My one concern with this is that we are instructing it to update the instrumentation with lessons learned before any human review is in the loop, could be too early?

It's interesting to see the changes it makes according to the instrumentation challenges it faces.
I have not included its discoveries and changes so far because it feels too early, especially without golden instrumentations and an easy way to compare the output against them.

wconti27 · 2026-03-10T13:43:43Z

.claude/skills/add-apm-integrations/SKILL.md

+- [ ] `settings.gradle.kts` entry added in alphabetical order
+- [ ] `build.gradle` has `compileOnly` deps and `muzzle` directives with `assertInverse = true`
+- [ ] `@AutoService(InstrumenterModule.class)` annotation present on the module class
+- [ ] `helperClassNames()` lists ALL referenced helpers (including inner, anonymous, and enum synthetic classes)
+- [ ] Advice methods are `static` with `@Advice.OnMethodEnter` / `@Advice.OnMethodExit` annotations
+- [ ] `suppress = Throwable.class` on enter/exit (unless the hooked method is a constructor)
+- [ ] No logger field in the Advice class or InstrumenterModule class
+- [ ] No `inline=false` left in production code
+- [ ] No `java.util.logging.*` / `java.nio.*` / `javax.management.*` in bootstrap path
+- [ ] Span lifecycle order is correct: startSpan → afterStart → activateSpan (enter); onError → beforeFinish → finish → close (exit)
+- [ ] Muzzle passes


can we mention the new context API and reference, with notes that the context api must be used and there may be limited examples, and new integrations can be based off of reference integrations, but still should use the new context api.

> but still should use the new context api.

For clarification, using the new Context API in an instrumentation that depends on other instrumentations still using the legacy approach may make the generated instrumentation fail. It’s not a matter of always applying it to make things work; it depends on how the instrumentations interact with each other. In this case, it feels like the LLM is doing a good job, on average, of finding the most relevant working API to use.
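
The span lifecycle order listed in the checklist hunk above can be sketched with a simple call recorder. `SpanLifecycleSketch` is hypothetical and only records step names as strings. It does not use the real dd-trace-java span, scope, or decorator classes, and it takes no position on the legacy API versus the Context API. Invoking `onError` on every exit with a possibly null throwable is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public final class SpanLifecycleSketch {
  private SpanLifecycleSketch() {}

  // Enter side of the checklist order: startSpan -> afterStart -> activateSpan.
  static void onEnter(List<String> calls) {
    calls.add("startSpan");
    calls.add("afterStart");
    calls.add("activateSpan");
  }

  // Exit side: onError -> beforeFinish -> finish -> close.
  // Assumption of this sketch: onError always runs and receives null on success.
  static void onExit(List<String> calls, Throwable thrown) {
    calls.add(thrown == null ? "onError(null)" : "onError(" + thrown.getMessage() + ")");
    calls.add("beforeFinish");
    calls.add("finish");
    calls.add("close");
  }

  /** Surrounds a fake target call the way enter/exit advice would. */
  public static List<String> run(boolean fail) {
    List<String> calls = new ArrayList<>();
    onEnter(calls);
    Throwable thrown = null;
    try {
      if (fail) {
        throw new IllegalStateException("boom");
      }
      calls.add("target");
    } catch (IllegalStateException e) {
      thrown = e; // captured so the exit side can report it
    } finally {
      onExit(calls, thrown);
    }
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(run(false));
    System.out.println(run(true));
  }
}
```

In both the success and the failure path, the exit steps run in the same order, and the scope is closed last.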

wconti27 · 2026-03-10T13:59:58Z

.claude/skills/add-apm-integrations/SKILL.md

+- [ ] Instrumentation tests pass
+- [ ] `latestDepTest` passes
+- [ ] `spotlessCheck` passes


Can we mention the new context API and reference, with notes that the context api must be used and there may be limited examples, and new integrations can be based off of reference integrations, but still should use the new context api.

PerfectSlayer requested a review from a team as a code owner March 9, 2026 16:56

PerfectSlayer added the `tag: no release notes` (Changes to exclude from release notes) label Mar 9, 2026

PerfectSlayer requested a review from manuel-alvarez-alvarez March 9, 2026 16:56

PerfectSlayer added the `tag: experimental` (Experimental changes) and `tag: ai generated` (Largely based on code generated by an AI or LLM) labels Mar 9, 2026

PerfectSlayer requested review from jordan-wong and wconti27 and removed request for manuel-alvarez-alvarez March 9, 2026 16:57

wconti27 reviewed Mar 10, 2026

mcculls reviewed Mar 10, 2026

mcculls reviewed Mar 10, 2026

jordan-wong reviewed Mar 10, 2026

wconti27 reviewed Mar 10, 2026

PerfectSlayer force-pushed the bbujon/ai-toolkit branch from b56d918 to e851657 Compare March 11, 2026 14:27

PerfectSlayer added 4 commits March 13, 2026 09:03

feat(ai): Add skill to create instrumentations

1cd3e0a

feat(ai): Add lambda rule for advice

53871b5

feat(ai): Use markdown link format

2e7e190

feat(ai): Improve skill name and trigger

3cedda9

PerfectSlayer force-pushed the bbujon/ai-toolkit branch from 37136b7 to 3cedda9 Compare March 13, 2026 08:04


4 participants
